This is a PLACEHOLDER. WORK IN PROGRESS... PROCEED AT YOUR OWN RISK.

Seriously though, I just copied an old notebook here to test things. I will update soon.

  • Andy. 13 Feb, 2021

OVERVIEW

This project began while I was an Insight Data Science fellow. It grew out of my interest in building data-driven tools for the fashion/retail space, where I had most recently been working. The original, over-scoped idea was a shoe design tool that could quickly generate some initial sneakers from a few chosen examples and some text descriptors, with designs constrained by the "latent space" defined (discovered?) by a database of shoe images. Given the 3-week sprint allowed for development, however, I pared the tool down to a simple "aesthetic" recommender for sneakers, using the same idea of an embedding space defined by the database of shoe images.

Part 0: DATA

The data has been munged... link to details here [01_data.ipynb]

Part 3: ResNet feature extractor

Embed the database into a feature space.

Evaluate the embedding with a simple logistic regression on category classification.

FastAI getting started

filename = "zappos-50k-simplified_sort"
df = pd.read_pickle(f"data/{filename}.pkl")
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-5-3b607af4f74a> in <module>
      1 filename = "zappos-50k-simplified_sort"
----> 2 df = pd.read_pickle(f"data/{filename}.pkl")

~/anaconda3/envs/fastbook/lib/python3.8/site-packages/pandas/io/pickle.py in read_pickle(filepath_or_buffer, compression)
    167     if not isinstance(fp_or_buf, str) and compression == "infer":
    168         compression = None
--> 169     f, fh = get_handle(fp_or_buf, "rb", compression=compression, is_text=False)
    170 
    171     # 1) try standard library Pickle

~/anaconda3/envs/fastbook/lib/python3.8/site-packages/pandas/io/common.py in get_handle(path_or_buf, mode, encoding, compression, memory_map, is_text, errors)
    497         else:
    498             # Binary mode
--> 499             f = open(path_or_buf, mode)
    500         handles.append(f)
    501 

FileNotFoundError: [Errno 2] No such file or directory: 'data/zappos-50k-simplified_sort.pkl'

Next step: extract features from ResNet-50.

torchvision & FastAI

We'll use a ResNet-50 trained on ImageNet, loaded from torchvision, to embed our sneakers into a feature space. (I originally prototyped this with Google's MobileNetV2; the code below follows the same pattern.)

We'll get the pretrained net via torchvision and then neuter the classification/pooling "head".

This feature space is how we'll find "similar" sneakers.

import torchvision

Because we want to collect the features output from the model rather than perform classification (or some other decision), I replaced the classifier head with an identity mapping, implemented as a small Identity nn.Module class.
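The Identity class itself isn't shown in this excerpt; a minimal version is just a pass-through module:

```python
import torch
import torch.nn as nn

class Identity(nn.Module):
    """A pass-through module: returns its input unchanged.
    Swapping it in for resnet.fc makes the net output the pooled features."""
    def forward(self, x):
        return x
```

(Recent PyTorch versions ship `nn.Identity`, which does the same thing out of the box.)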

Finally, since we are computing the features (the embedding) for over 30k images, let's load the computation onto the GPU. We need to remember to put the net in evaluation mode so the batch-norm and dropout layers are disabled. [I forgot to do this initially and lost hours trying to figure out why I wasn't getting consistent results.] Setting param.requires_grad = False saves memory since we aren't fitting any weights for now, and protects us in case we forget to wrap inference in a with torch.no_grad() block.

ASIDE: I'm running the compute on my data-pizza oven: a Linux machine with a powerful CPU, a cheap (but powerful) GPU, and a bunch of memory, all in the gutted shell of an old PowerMac G5 case I picked up at a garage sale for $25! I call it the BrickOven Toaster. Check it out [here]

Later, when we use the full FastAI API, this should all be handled elegantly behind the scenes.

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")


device
def get_ResNet_feature_net(to_cuda=False):
    
    # following the pattern for MNetV2, but could use the fastai resnet instead (just need to remove fc)
    resnet = torchvision.models.resnet50(pretrained=True)
    num_ftrs = resnet.fc.in_features
    print(num_ftrs)
    
    # neuter the classifier head so the net outputs the pooled features
    resnet.fc = Identity()
    
    if to_cuda:
        device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    else:
        device = torch.device("cpu")
    resnet = resnet.to(device)
    resnet.eval()  # disable batch-norm updates and dropout

    # just in case we forget the no_grad()
    for param in resnet.parameters():
        param.requires_grad = False
        
    return resnet



rnet = get_ResNet_feature_net(to_cuda=True)
2048
rnet
ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): Bottleneck(
      (conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
    (2): Bottleneck(
      (conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
  )
  (layer2): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): Bottleneck(
      (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
    (2): Bottleneck(
      (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
    (3): Bottleneck(
      (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
  )
  (layer3): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(512, 1024, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): Bottleneck(
      (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
    (2): Bottleneck(
      (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
    (3): Bottleneck(
      (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
    (4): Bottleneck(
      (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
    (5): Bottleneck(
      (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
  )
  (layer4): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(1024, 2048, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): Bottleneck(
      (conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
    (2): Bottleneck(
      (conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(1, 1))
  (fc): Identity()
)
batch_size = 128
def get_x(r): return path_images/r['path']
#def get_y(r): return r['Category']  # we aren't actually using the category here (see 02_model.ipynb)
def get_fname(r): return r['path']

def get_dls(data, batch_size, size, device):
    # put everything in train, and don't do any augmentation: we just
    # resize and normalize to imagenet_stats
    # note: get_y returns the file path, so the dls vocab maps class indices back to paths
    dblock = DataBlock(blocks=(ImageBlock, CategoryBlock),
                       splitter=IndexSplitter([]),
                       get_x=get_x,
                       get_y=get_fname,
                       item_tfms=Resize(size, method='pad', pad_mode='border'),
                       batch_tfms=Normalize.from_stats(*imagenet_stats))  # border pads white...

    dls = dblock.dataloaders(data, bs=batch_size, drop_last=False, device=device)
    # since we are just calculating features for all the data, turn off shuffling
    dls.train.shuffle = False
    return dls
def get_all_feats(dls, conv_net):
    vects = []
    clss = []
    paths = []
    batchn = 0

    for imgs, classes in dls.train:
        with torch.no_grad():
            outs = conv_net(imgs)
        vects.extend(list(outs.data.cpu().numpy()))
        cs = classes.data.cpu().numpy()
        clss.extend(list(cs))
        ps = [dls[0].vocab[c] for c in cs]
        # keep the paths for a sanity check
        paths.extend(ps)
        batchn += 1

    # store all relevant info in a pandas DataFrame
    df_feats = pd.DataFrame({"path": paths, "classes": clss, "features": vects})
    return df_feats
for i,sz in enumerate(IMG_SIZES):
    print(IMG_SIZES[sz])
    dls = get_dls(df,batch_size,IMG_SIZES[sz],device)
    df_f = get_all_feats(dls,rnet)
    # save it
    filename = f"resnet50-features_{sz}"
    df_f.to_pickle(f"data/{filename}.pkl")
128
160
224
filename = "resnet50-features_small"
df_sm = pd.read_pickle(f"data/{filename}.pkl")
filename = "resnet50-features_medium"
df_md = pd.read_pickle(f"data/{filename}.pkl")
filename = "resnet50-features_large"
df_lg = pd.read_pickle(f"data/{filename}.pkl")
df_test = pd.merge(df_sm,df_md,how='left',on='path',suffixes=('_sm','_md'))
df_test = pd.merge(df_test,df_lg,how='left',on='path')
df_test = df_test.rename(columns={"classes": "classes_lg", "features": "features_lg"})
# explicitly:
df2 = pd.merge(df, df_test,  how='left', on='path')
filename = "zappos-50k-resnet50-features_"
df2.to_pickle(f"data/{filename}.pkl")

df2 = df2.sort_values('path', ascending=True)
df2 = df2.reset_index(drop=True)
df2.head(3)
[Output of df2.head(3), truncated: one row per shoe with the metadata columns (CID, Category, path, path_and_file, Category1, Category2, Filename, Sneakers, Boots, Shoes, Slippers, Adult, Gender, train, test, validate, t_t_v) plus the classes_sm/md/lg indices and the 2048-dimensional features_sm/md/lg vectors.]
filename = "zappos-50k-resnet50-features_sort_3"
df2.to_pickle(f"data/{filename}.pkl")
df = df2

If we've already calculated everything, we can just load it.
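A minimal reload guard might look like this (a sketch, assuming the pickle written above is on disk):

```python
import pandas as pd
from pathlib import Path

# name used when saving the merged feature table above
filename = "zappos-50k-resnet50-features_sort_3"
pkl_path = Path("data") / f"{filename}.pkl"

if pkl_path.exists():
    # skip the (slow) feature extraction and just load the cached table
    df = pd.read_pickle(pkl_path)
```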

SANITY CHECK: Can we extract single features that match those we just calculated?

query_image = "Shoes/Sneakers and Athletic Shoes/Nike/7716996.288224.jpg"

query_ind = df[df["path"]==query_image].index

#df[df['path']==query_image]
df.loc[query_ind,['path','classes_sm']]
path classes_sm
27079 Shoes/Sneakers and Athletic Shoes/Nike/7716996.288224.jpg 27079

The DataBlock performed a number of processing steps to prepare the images for embedding into the ResNet-50 feature space (a 2048-vector). Let's confirm that we get the same image and the same ResNet-50 features.

base_im = PILImage.create(path_images/query_image)
# workaround: pass split_idx=1 so Resize uses its deterministic validation-time behavior
img = Resize(IMG_SIZE, method='pad', pad_mode='border')(base_im, split_idx=1)
t2 = ToTensor()(img)
t2 = IntToFloatTensor()(t2)
t2 = torchvision.transforms.Normalize(*imagenet_stats)(t2)
t2.shape
(3, 160, 160)

That seemed to work well. I'll just wrap it in a simple function for now, though a FastAI Pipeline might work best in the long run.

def load_and_prep_sneaker(image_path, size=IMG_SIZE, to_cuda=False):
    """input: expects a Path(), but a string should work

    output: TensorImage ready to unsqueeze and "embed"

    TODO: make this a Pipeline?
    """
    base_im = PILImage.create(image_path)
    # workaround: pass split_idx=1 so Resize uses its deterministic validation-time behavior
    img = Resize(size, method='pad', pad_mode='border')(base_im, split_idx=1)
    t2 = ToTensor()(img)
    t2 = IntToFloatTensor()(t2)
    t2 = torchvision.transforms.Normalize(*imagenet_stats)(t2)
    
    if to_cuda:
        device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    else:
        device = torch.device("cpu")
        
    return t2.to(device)

path_images/query_image
    
Path('/home/ergonyc/.fastai/data/ut-zap50k-images/Shoes/Sneakers and Athletic Shoes/Nike/7716996.288224.jpg')
def get_convnet_feature(cnet, t_image, to_cuda=False):
    """
    input:
        cnet - our neutered & prepped net (ResNet or MobileNetV2)
        t_image - TensorImage, probably 3x224x224... but could be a batch
        to_cuda - send to GPU? default is CPU (to_cuda=False)
    output:
        features - output of the feature net (n x 2048 for ResNet-50)
    """

    # this is redundant but safe
    if to_cuda:
        device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    else:
        device = torch.device("cpu")

    cnet = cnet.to(device)
    t_image = t_image.to(device)  # .to() is not in-place: remember to reassign

    if len(t_image.shape) < 4:
        t_image = t_image.unsqueeze(0)

    with torch.no_grad():
        features = cnet(t_image)

    return features
    
    
query_image2 = '/home/ergonyc/Downloads/491212_01.jpg.jpeg'

query_t = load_and_prep_sneaker(path_images/query_image)

#test_feats = get_mnet_feature(mnetv2,query_t)
test_feats = get_convnet_feature(rnet, query_t)
test_feats.shape
(1, 2048)

Now I have the "embeddings" of the database in the ResNet-50 feature space. I can run a logistic regression on these vectors (which should be equivalent to mapping these 2048-dimensional vectors to the 4 categories (Part 3)), but I can also use an approximate KNN in this space to run the SneakerFinder tool.
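As a quick sketch of that logistic-regression idea, a probe over the stored feature column might look like this (column names follow the DataFrame above; the `category_probe` helper is my name for it):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def category_probe(df, feature_col="features_sm", label_col="Category"):
    """Fit a logistic regression from stored feature vectors to the shoe
    category, and report training accuracy as a rough quality check."""
    X = np.vstack(df[feature_col].values)   # (n_items, 2048) for ResNet-50
    y = df[label_col].values
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X, y)
    return clf.score(X, y)
```

A held-out split would of course be better for a real evaluation; training accuracy is just a quick separability check.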

Next up:

  1. make KNN functions... maybe approximate KNN, e.g. Annoy, for speed. Or precalculate.
  2. PCA / t-SNE / UMAP the space, colored by category, to visualize the embedding.
  3. make widgets to turn this into an actual tool / API.

Let's find the nearest neighbors as a proxy for "similar".

I'll start with a simple "gut" test, and point out that there really isn't a ground truth to refer to. Remember that the goal of all this is to find some shoes that someone will like; we are using "similar" as an approximation of human preference.

Let's use our previously calculated sneaker features and check that the k nearest neighbors in our embedding space feel or look "similar".

Personally, I like Jordans, so I chose this as my query_image: Sample Jordan

k-Nearest Neighbors

from sklearn.neighbors import NearestNeighbors
import umap

def get_umap_reducer(latents):
    reducer = umap.UMAP(random_state=666)
    reducer.fit(latents)
    
    return reducer
num_neighs = 5

knns = []
reducers = []
for i,sz in enumerate(IMG_SIZES):
    print(ABBR[sz])
    print(IMG_SIZES[sz])
    
    features = f"features_{ABBR[sz]}"
    print(features)
    
    db_feats = np.vstack(df[features].values)
    
    neighs = NearestNeighbors(n_neighbors=num_neighs)  # could use k+1 in case the query image itself is in the database
    neighs.fit(db_feats)
    
    knns.append(neighs)
    
    reducer = get_umap_reducer(db_feats)
    reducers.append(reducer)
    
    
sm
128
features_sm
md
160
features_md
lg
224
features_lg

Let's take a quick look at the nearest neighbors:

neighs = knns[0]
distance, nn_index = neighs.kneighbors(test_feats, return_distance=True)    
dist = distance.tolist()[0] 

df.columns
Index(['CID', 'Category', 'path', 'path_and_file', 'Category1', 'Category2',
       'Filename', 'Sneakers', 'Boots', 'Shoes', 'Slippers', 'Adult', 'Gender',
       'train', 'test', 'validate', 't_t_v', 'classes_sm', 'features_sm',
       'classes_md', 'features_md', 'classes_lg', 'features_lg'],
      dtype='object')
paths = df[['path','classes_sm','classes_md','classes_lg']]
neighbors = paths.iloc[nn_index.tolist()[0]].copy()
images = [ PILImage.create(path_images/f) for f in neighbors.path] 
#PILImage.create(btn_upload.data[-1])
for im in images:
    display(im.to_thumb(IMG_SIZE,IMG_SIZE))
          
# img_row = df['path'].values[nn_index[0]]
# img_row = np.insert(img_row, 0, query_image, axis=0)

type(neighs)
sklearn.neighbors._unsupervised.NearestNeighbors
def query_neighs(q_feat, myneighs, data, root_path, show=True):
    """
    q_feat: query feature (vector)
    myneighs: fit knn object
    data: series or df containing "path"
    root_path: path to image files
    """
    distance, nn_index = myneighs.kneighbors(q_feat, return_distance=True)  
    dist = distance.tolist()[0] 

    # fix path to the database...
    neighbors = data.iloc[nn_index.tolist()[0]].copy()
    images = [ PILImage.create(root_path/f) for f in neighbors.path] 
    #PILImage.create(btn_upload.data[-1])
    if show:
        for im in images: display(im.to_thumb(IMG_SIZE,IMG_SIZE))

    return images
        
        
feature_func = get_convnet_feature

similar_images = []
for i,sz in enumerate(IMG_SIZES):
    print(ABBR[sz])
    print(IMG_SIZES[sz])
    
    features = f"features_{ABBR[sz]}"
    print(features)

    query_t = load_and_prep_sneaker(path_images/query_image,IMG_SIZES[sz])
    #query_f = get_convnet_feature(mnetv2,query_t)
    query_f = get_convnet_feature(rnet,query_t)
    
    similar_images.append( query_neighs(query_f, knns[i], paths, path_images, show=False) )
     
    im = PILImage.create(path_images/query_image)
    display(im.to_thumb(IMG_SIZES[sz]))
sm
128
features_sm
md
160
features_md
lg
224
features_lg
   
def plot_sneak_neighs(images):
    ''' plot a matrix of images;
        images[row][0] should be the query image

    Args:
        images: list of lists of images

    return:
        None; saves the figure to image_search.png
    '''
    nrow = len(images)
    ncol = len(images[0])
    
    fig = plt.figure(figsize = (20, 20))

    num=0
    for row,image_row in enumerate(images):
        for col,img in enumerate(image_row):
    
            plt.subplot(nrow, ncol, num+1)
            plt.axis('off')
            plt.imshow(img);

            if num%ncol == 0: 
                plt.title('Query')

            if col>0: 
                plt.title('Neighbor ' + str(col))
            num += 1
    plt.savefig('image_search.png')
    plt.show()
        
plot_sneak_neighs(similar_images)
similar_images2 = []
for i,sz in enumerate(IMG_SIZES):
    print(ABBR[sz])
    print(IMG_SIZES[sz])
    
    features = f"features_{ABBR[sz]}"
    print(features)

    query_t = load_and_prep_sneaker(path_images/query_image2,IMG_SIZES[sz])
    #query_f = get_convnet_feature(mnetv2,query_t)
    query_f = get_convnet_feature(rnet,query_t)
    
    similar_images2.append( query_neighs(query_f, knns[i], paths, path_images, show=False) )

    im = PILImage.create(path_images/query_image2)
    display(im.to_thumb(IMG_SIZES[sz]))
    
plot_sneak_neighs(similar_images2)
sm
128
features_sm
md
160
features_md
lg
224
features_lg

Visualize the embedding: PCA + UMAP

df.columns
Index(['CID', 'Category', 'path', 'path_and_file', 'Category1', 'Category2',
       'Filename', 'Sneakers', 'Boots', 'Shoes', 'Slippers', 'Adult', 'Gender',
       'train', 'test', 'validate', 't_t_v', 'classes_sm', 'features_sm',
       'classes_md', 'features_md', 'classes_lg', 'features_lg'],
      dtype='object')
import seaborn as sns
from sklearn.decomposition import PCA
import umap

# first simple PCA
pca = PCA(n_components=2)

for i,sz in enumerate(IMG_SIZES):
    print(ABBR[sz])
    print(IMG_SIZES[sz])
    
    features = f"features_{ABBR[sz]}"
    print(features)
    
    data = df[['Category',features]].copy()

    db_feats = np.vstack(data[features].values)

    # PCA
    pca_result = pca.fit_transform(db_feats)
    data['pca-one'] = pca_result[:,0]
    data['pca-two'] = pca_result[:,1] 
    print(f"Explained variation per principal component (sz{sz}): {pca.explained_variance_ratio_}")

    smpl_fac=.5
    #data=df.reindex(rndperm)

    plt.figure(figsize=(16,10))
    sns.scatterplot(
        x="pca-one",
        y="pca-two",
        hue="Category",
        palette=sns.color_palette("hls", 4),
        data=data.sample(frac=smpl_fac),
        legend="full",
        alpha=0.3
    )
    plt.savefig(f'PCA 2-D sz{sz}')
    plt.show()
    
    
    # get the UMAP on deck
    embedding = reducers[i].transform(db_feats)
    
    data['umap-one'] = embedding[:,0]
    data['umap-two'] = embedding[:,1] 

    plt.figure(figsize=(16,10))
    sns.scatterplot(
        x="umap-one",
        y="umap-two",
        hue="Category",
        palette=sns.color_palette("hls", 4),
        data=data.sample(frac=smpl_fac),
        legend="full",
        alpha=0.3
    )
    plt.gca().set_aspect('equal', 'datalim')
    plt.title(f'UMAP projection of ResNet-embedded UT-Zappos data (sz{sz})', fontsize=24)
    plt.savefig(f'UMAP 2-D sz{sz}')
    plt.show()
sm
128
features_sm
Explained variation per principal component (szsmall): [0.1314021  0.09826419]
md
160
features_md
Explained variation per principal component (szmedium): [0.14228487 0.09591584]
lg
224
features_lg
Explained variation per principal component (szlarge): [0.17455636 0.10268292]
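The explained-variance numbers printed above come from a 2-component PCA; the step can be checked in isolation on synthetic data (NumPy and scikit-learn only; `db_feats` here is a random stand-in for the real feature matrix):

```python
import numpy as np
from sklearn.decomposition import PCA

# random stand-in for the (n_images, n_features) feature matrix
rng = np.random.default_rng(42)
db_feats = rng.normal(size=(500, 32))

pca = PCA(n_components=2)
pca_result = pca.fit_transform(db_feats)

# each entry is the fraction of total variance captured by that component
ratios = pca.explained_variance_ratio_
```

On real features the first two ratios are much larger than on isotropic noise, which is why the ~13-17% figures above are meaningful.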
def get_umap_embedding(latents):
    reducer = umap.UMAP(random_state=666)
    reducer.fit(latents)
    embedding = reducer.transform(latents)
    assert(np.all(embedding == reducer.embedding_))
    
    return embedding
fn = df.path.values
type(db_feats)

snk2vec = dict(zip(fn,db_feats))

snk2vec[list(snk2vec.keys())[0]]

embedding = get_umap_embedding(db_feats)
snk2umap = dict(zip(fn,embedding))
  

Now use ipywidgets to make this into a simple "tool"

btn_run = widgets.Button(description='Find k-nearest neighbors')
out_pl = widgets.Output()
lbl_neighs = widgets.Label()
btn_upload = widgets.FileUpload()

def _load_image(im):
    """input: expects a Path(), but a string or bytestring should also work

       returns: resized & padded square image
    """
    image = PILImage.create(im)
    # pass split_idx=1 so Resize applies its deterministic validation-time behavior
    image = Resize(IMG_SIZE, method='pad', pad_mode='border')(image, split_idx=1)
    return image

def _prep_image(image,to_cuda=False):
    """input: squared/resized PIL image
    
        output TensorImage ready to unsqueeze and "embed"
    TODO:  make this a Pipeline?

    """
    t2 = ToTensor()(image)
    t2 = IntToFloatTensor()(t2)
    t2 = torchvision.transforms.Normalize(*imagenet_stats)(t2)
    
    if to_cuda:
        device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    else:
        device = torch.device("cpu")
        
    return t2.to(device)
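`_prep_image` above normalizes with the ImageNet statistics; the arithmetic is just (x - mean) / std per channel. A NumPy-only sketch of the same step (channel-last layout for simplicity; torchvision's `Normalize` expects channel-first tensors):

```python
import numpy as np

# ImageNet channel statistics, as in imagenet_stats
imagenet_mean = np.array([0.485, 0.456, 0.406])
imagenet_std = np.array([0.229, 0.224, 0.225])

# toy 4x4 RGB image with values already scaled to [0, 1]
rng = np.random.default_rng(0)
img = rng.uniform(size=(4, 4, 3))

# channel-wise normalization, broadcasting the stats over height and width
normed = (img - imagenet_mean) / imagenet_std
```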




#img = _load_img(im).flip_lr()
  
conv_net = rnet

def on_click_find_similar(change):
    """ """
    
    im = btn_upload.data[-1]
    
    img = _load_image(im)
    tensor_im = _prep_image(img,to_cuda=False)

    feats = get_convnet_feature(conv_net, tensor_im )
    distance, nn_index = neighs.kneighbors(feats.numpy(), return_distance=True)    
    dist = distance.tolist()[0] 

    # look up the neighbor rows in the database
    neighbors = df.iloc[nn_index.tolist()[0]].copy()

    out_pl.clear_output()

    images = [PILImage.create(path_images/f) for f in neighbors.path]
    with out_pl:
        display(img.to_thumb(200,200))
        for i in images:
            display(i.to_thumb(100,100))
       
    lbl_neighs.value = f'distances: {dist}'   

    
btn_run.on_click(on_click_find_similar)




widgets.VBox([widgets.Label('Find your sneaker!'), 
      btn_upload, btn_run, out_pl, lbl_neighs])
# import time
# # import matplotlib.pyplot as pltmodel
# import matplotlib.image as mpimg
# import matplotlib.pyplot as plt
# from mpl_toolkits.mplot3d import Axes3D
# import plotly
# import plotly.express as px
# import plotly.figure_factory as FF
import bokeh.plotting as bplt #import figure, show, output_notebook
#from bokeh.models import HoverTool, ColumnDataSource, CategoricalColorMapper
import bokeh
# from bokeh.palettes import Spectral10

import umap

#from scipy import spatial  #for now just brute force to find neighbors
import scipy 
#from scipy.spatial import distance

from io import BytesIO
import base64



##########################################
#  BOKEH
##########################################
def init_bokeh_plot(umap_df):

    bplt.output_notebook()

    datasource = bokeh.models.ColumnDataSource(umap_df)
    color_mapping = bokeh.models.CategoricalColorMapper(factors=["sns","goat"],
                                        palette=bokeh.palettes.Spectral10)

    plot_figure = bplt.figure(
        title='UMAP projection VAE latent',
        plot_width=1000,
        plot_height=1000,
        tools=('pan, wheel_zoom, reset')
    )

    plot_figure.add_tools(bokeh.models.HoverTool(tooltips="""
    <div>
        <div>
            <img src='@image' style='float: left; margin: 5px 5px 5px 5px'/>
        </div>
        <div>
            <span style='font-size: 14px'>@fname</span>
            <span style='font-size: 14px'>@loss</span>
        </div>
    </div>
    """))

    plot_figure.circle(
        'x',
        'y',
        source=datasource,
        color=dict(field='db', transform=color_mapping),
        line_alpha=0.6,
        fill_alpha=0.6,
        size=4
    )

    return plot_figure


def embeddable_image(label):
    return image_formatter(label)

def get_thumbnail(path):
    i = Image.open(path)
    i.thumbnail((64, 64), Image.LANCZOS)
    return i

def image_base64(im):
    if isinstance(im, str):
        im = get_thumbnail(im)
    with BytesIO() as buffer:
        im.save(buffer, 'png')
        return base64.b64encode(buffer.getvalue()).decode()

def image_formatter(im):
    return f"data:image/png;base64,{image_base64(im)}"
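`image_formatter` above builds a base64 `data:` URI so Bokeh tooltips can inline thumbnails; the encoding round-trips losslessly, as a stdlib-only sketch shows (fake PNG bytes stand in for a real encoded thumbnail):

```python
import base64

# fake PNG bytes standing in for an encoded thumbnail
png_bytes = b"\x89PNG\r\n\x1a\nfake-payload"

# encode as image_formatter does: bytes -> base64 text -> data URI
b64 = base64.b64encode(png_bytes).decode()
data_uri = f"data:image/png;base64,{b64}"

# decoding the payload after the comma recovers the original bytes exactly
decoded = base64.b64decode(data_uri.split(",", 1)[1])
```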



# do we need it loaded... it might be fast enough??
#@st.cache
def load_UMAP_data():
    data_dir = f"data/{model_name}-X{params['x_dim'][0]}-Z{params['z_dim']}"
    load_dir = os.path.join(data_dir,f"kl_weight{int(params['kl_weight']):03d}")
    snk2umap = ut.load_pickle(os.path.join(load_dir,"snk2umap.pkl"))
    
    return snk2umap


def load_latent_data():
    data_dir = f"data/{model_name}-X{params['x_dim'][0]}-Z{params['z_dim']}"
    snk2umap = load_UMAP_data()

    # load df (filenames and latents...)

    mids = list(snk2vec.keys())
    vecs = np.array([snk2vec[m] for m in mids])
    vec_tree = scipy.spatial.KDTree(vecs)


    latents = np.array(list(snk2vec.values()))
    losses = np.array(list(snk2loss.values()))
    labels = np.array(mids)

    labels2 = np.array(list(snk2umap.keys()))
    embedding = np.array(list(snk2umap.values()))

    assert(np.all(labels == labels2))    
    umap_df = pd.DataFrame(embedding, columns=('x', 'y'))

    umap_df['digit'] = [str(x.decode()) for x in labels]
    umap_df['image'] = umap_df.digit.map(lambda f: embeddable_image(f))
    umap_df['fname'] = umap_df.digit.map(lambda x: f"{x.split('/')[-3]} {x.split('/')[-1]}")
    umap_df['db'] = umap_df.digit.map(lambda x: f"{x.split('/')[-3]}")
    umap_df['loss'] = [f"{x:.1f}" for x in losses]

    return umap_df,snk2vec,latents, labels, vecs,vec_tree,mids


#%%
# pca_result = pca.fit_transform(df['feats'].values.tolist())
# df['pca-one'] = pca_result[:,0]
# df['pca-two'] = pca_result[:,1] 
# df['pca-three'] = pca_result[:,2]
# print('Explained variation per principal component: {}'.format(pca.explained_variance_ratio_))


# #data=df.sample(frac=1.0)
# #data=df.reindex(rndperm)
# data = df

# #df_subset = df

# time_start = time.time()
# tsne = TSNE(n_components=2, verbose=1, perplexity=40, n_iter=300)
# tsne_results = tsne.fit_transform(db_feats)
# print('t-SNE done! Time elapsed: {} seconds'.format(time.time()-time_start))



# df['tsne-2d-one'] = tsne_results[:,0]
# df['tsne-2d-two'] = tsne_results[:,1]
# plt.figure(figsize=(16,10))
# sns.scatterplot(
#     x="tsne-2d-one", y="tsne-2d-two",
#     hue="CategoryDir",
#     palette=sns.color_palette("hls", 4),
#     data=df,
#     legend="full",
#     alpha=0.3
#)
Explained variation per principal component: [0.04946684 0.03347368 0.03010488 0.02720879 0.02283907 0.02097444 0.01830503 0.01577074 0.01528187 0.014629  ]
[t-SNE] Computing 121 nearest neighbors...
[t-SNE] Indexed 33229 samples in 4.504s...
# import matplotlib.image as mpimg
# import random 
# from PIL import Image
# import requests
# from io import BytesIO

Logistic regression on the mobilenet_v2 features

from sklearn.metrics import confusion_matrix
from seaborn import heatmap
from sklearn.linear_model import LogisticRegression
    
# logistic-regression probe: display a confusion matrix on the held-out test split
feat_col = 'features_lg'  # no plain 'features' column exists; pick one of features_sm/md/lg

X_test = np.vstack(df[df.t_t_v=='test'][feat_col])
y_test = df[df.t_t_v=='test']['Category'].values


X_train = np.vstack(df[df.t_t_v=='train'][feat_col])
y_train = df[df.t_t_v=='train']['Category'].values


clf_log = LogisticRegression(C=1, multi_class='ovr', max_iter=2000, solver='lbfgs')
clf_log.fit(X_train, y_train)
log_score = clf_log.score(X_test, y_test)
log_ypred = clf_log.predict(X_test)

log_confusion_matrix = confusion_matrix(y_test, log_ypred)
print(log_confusion_matrix)

disp = heatmap(log_confusion_matrix, annot=True, linewidths=0.5, cmap='Blues')
plt.savefig('log_Matrix.png')


plt.figure(figsize=(16,16))


# Plot non-normalized confusion matrix
titles_options = [("Confusion matrix, without normalization", None),
                  ("Normalized confusion matrix", 'true')]
class_names = df.Category.unique()

from sklearn.metrics import plot_confusion_matrix

for title, normalize in titles_options:
    disp = plot_confusion_matrix(clf_log, X_test, y_test,
                                 display_labels=class_names,
                                 cmap=plt.cm.Blues,
                                 normalize=normalize)
    disp.ax_.set_title(title)

    print(title)
    print(disp.confusion_matrix)

plt.savefig('log_Matrix2.png')
Confusion matrix, without normalization
[[1193   31    3   34]
 [  28 2061   11   69]
 [   5   28  102    5]
 [  32   88    3 1292]]
Normalized confusion matrix
[[0.94607454 0.02458366 0.00237906 0.02696273]
 [0.01290917 0.95020747 0.00507146 0.03181189]
 [0.03571429 0.2        0.72857143 0.03571429]
 [0.02261484 0.06219081 0.00212014 0.9130742 ]]
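The "Normalized confusion matrix" above uses `normalize='true'`, i.e. each row is divided by its row sum, so the diagonal reads as per-class recall. A NumPy-only sketch of that normalization on a toy matrix:

```python
import numpy as np

# toy raw confusion matrix: rows = true class, columns = predicted class
cm = np.array([[8, 2],
               [1, 9]])

# normalize='true' divides each row by its row sum, giving per-class recall
cm_norm = cm / cm.sum(axis=1, keepdims=True)
```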

Part 4: full transfer learning. Re-tune MobileNet_v2 to classify my data

def get_x(r): return path_images/r['path']
def get_y(r): return r['Category']
def splitter(df):
    train = df.index[df['train']].tolist()
    valid = df.index[df['validate']].tolist()
    return train,valid
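The index-based `splitter` above can be exercised on a toy frame (pandas only; the function is repeated here so the sketch is self-contained, and the boolean column names match the notebook's flags):

```python
import pandas as pd

# repeat the notebook's splitter for a self-contained check
def splitter(df):
    train = df.index[df['train']].tolist()
    valid = df.index[df['validate']].tolist()
    return train, valid

# toy frame with the same boolean flag columns as the real df
toy = pd.DataFrame({
    'train':    [True, True, False, False, True],
    'validate': [False, False, True, True, False],
})
train_idx, valid_idx = splitter(toy)
```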

doc(DataBlock)

imagenet_stats
([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
batch_tfms=Normalize.from_stats(*imagenet_stats)

tfms = aug_transforms(mult=1.0, 
               do_flip=True, 
               flip_vert=False, 
               max_rotate=5.0, 
               min_zoom=1.0, 
               max_zoom=1.05, 
               max_lighting=0.1, 
               max_warp=0.05, 
               p_affine=0.75, 
               p_lighting=0.0, 
               xtra_tfms=None, 
               size=None, 
               mode='bilinear', 
               pad_mode='reflection', 
               align_corners=True, 
               batch=False, 
               min_scale=1.0)

# put everything in train, and don't do any augmentation since we are just going to
# resize to 160
dblock = DataBlock(blocks=(ImageBlock, CategoryBlock),
                   splitter=splitter, 
                   get_x=get_x, 
                   get_y=get_y,
                   item_tfms=Resize(160,method='pad', pad_mode='border'),
                   batch_tfms=tfms)  # border pads white...
dls = dblock.dataloaders(df,bs=64,drop_last=False)
list(models.mobilenet_v2()._modules.items())[1]
mobilenet_split = lambda m: (m[0][0][10], m[1])

learn = cnn_learner(dls, models.mobilenet_v2, splitter=mobilenet_split,cut=-1, pretrained=True,metrics=error_rate)
#learn = cnn_learner(dls, model_conv, splitter=mobilenet_split,cut=-1, pretrained=True)
lr_min,lr_steep = learn.lr_find()
lr_min, lr_steep
(0.00043651582673192023, 2.0892961401841603e-05)
doc(learn.fine_tune)
learn.predict(dls.dataset[10][0])


learn.fine_tune()
learn.fit_one_cycle(6, lr_max=1e-5)

learn.recorder.plot_loss()
epoch train_loss valid_loss time
0 2.521791 2.044415 00:48
1 2.600940 2.033086 00:49
2 2.579694 2.024415 00:48
3 2.568022 2.053141 00:49
4 2.577766 2.041003 00:48
5 2.585130 2.015271 00:49
model_conv = torchvision.models.mobilenet_v2(pretrained=True)
for param in model_conv.parameters():
    param.requires_grad = False

# Parameters of newly constructed modules have requires_grad=True by default
# just read this off: model_conv.classifier
num_categories = 4

num_ftrs = model_conv.classifier._modules['1'].in_features
model_conv.classifier._modules['1'] = nn.Linear(num_ftrs, num_categories)

def trns_mobilenet_v2():
    model_conv = torchvision.models.mobilenet_v2(pretrained=True)
    for param in model_conv.parameters():
        param.requires_grad = False
    # Parameters of newly constructed modules have requires_grad=True by default
    # just read this off: model_conv.classifier
    num_ftrs = model_conv.classifier._modules['1'].in_features
    model_conv.classifier._modules['1'] = nn.Linear(num_ftrs, num_categories)
    
    return model_conv



mnetV2 = torchvision.models.mobilenet_v2()
import torchvision
from torchvision import models

# def _mobilenetv2_split(m:nn.Module):
#     return (m[0][0][10], m[1])

mobilenet_split = lambda m: (m[0][0][10], m[1])
#arch  = torchvision.models.mobilenet_v2

model_conv  = models.mobilenet_v2(pretrained=True)

#learn = cnn_learner(dls, models.mobilenet_v2, cut=-1, pretrained=True)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model_conv = model_conv.to(device)